BioDIFF: An Effective Fast Change Detection Algorithm for Biological Annotations

Authors

  • Yang Song
  • Sourav S. Bhowmick
  • C. Forbes Dewey
Abstract

Warehousing heterogeneous, dynamic biological data is a key technique for biological data integration because it greatly improves performance. However, it requires complex maintenance procedures to update the warehouse when the sources change. Consequently, a key issue is how to detect changes to the underlying biological data sources. In this paper, we present an algorithm called BIODIFF for detecting exact changes to biological annotations. In our approach, we transform heterogeneous biological data to XML format and then detect changes between two versions of the XML representation of the data. Our algorithm extends X-Diff, a published XML change detection algorithm. X-Diff, being designed for any type of XML data, does not exploit the semantics of biological data to reduce the size of the data set used in bipartite mapping. We have implemented BIODIFF in Java and conducted an extensive performance study using data from EMBL, GenBank, SwissProt and PDB. Our experimental results show that BIODIFF runs 1.5 to 6 times faster than X-Diff.
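
The general idea of comparing two versions of an unordered XML record can be illustrated with a minimal sketch. This is not the authors' BIODIFF code, nor X-Diff itself: the function names `subtree_hash` and `diff_entries` and the sample `<entry>` records are hypothetical, and a simple greedy same-tag pairing stands in for the minimum-cost bipartite matching that X-Diff performs. The sketch only shows how identical subtrees can be cancelled out by hashing, and how restricting candidate pairs by element name shrinks the set of nodes that still need to be matched.

```python
import hashlib
import xml.etree.ElementTree as ET
from collections import defaultdict


def subtree_hash(node):
    """Order-insensitive hash of an element (tag, text, attributes, children)."""
    child_hashes = sorted(subtree_hash(child) for child in node)
    text = (node.text or "").strip()
    attrs = sorted(node.attrib.items())
    payload = repr((node.tag, text, attrs, child_hashes))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def diff_entries(old_xml, new_xml):
    """Return a sorted list of (operation, tag) pairs for top-level children."""
    old_root = ET.fromstring(old_xml)
    new_root = ET.fromstring(new_xml)

    # Step 1: cancel out children whose subtrees are identical, ignoring order.
    old_by_hash = defaultdict(list)
    for child in old_root:
        old_by_hash[subtree_hash(child)].append(child)
    old_left = list(old_root)
    new_left = []
    for child in new_root:
        bucket = old_by_hash.get(subtree_hash(child))
        if bucket:
            old_left.remove(bucket.pop())
        else:
            new_left.append(child)

    # Step 2 (assumption): pair leftovers only within the same tag, so the
    # candidate set for matching is much smaller than "all against all".
    old_by_tag = defaultdict(list)
    for child in old_left:
        old_by_tag[child.tag].append(child)
    changes = []
    for child in new_left:
        candidates = old_by_tag[child.tag]
        if candidates:
            candidates.pop(0)
            changes.append(("update", child.tag))
        else:
            changes.append(("insert", child.tag))
    for tag, rest in old_by_tag.items():
        changes.extend(("delete", tag) for _ in rest)
    return sorted(changes)


# Hypothetical annotation records, written only for this illustration.
OLD = ("<entry><accession>P12345</accession><keyword>kinase</keyword>"
       "<organism>Homo sapiens</organism></entry>")
NEW = ("<entry><organism>Homo sapiens</organism><accession>P12345</accession>"
       "<keyword>transferase</keyword><comment>reviewed</comment></entry>")

print(diff_entries(OLD, NEW))
```

In this sketch, reordering the `organism` and `accession` elements produces no change, because the comparison treats children as unordered; only the edited `keyword` and the new `comment` are reported.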

Similar resources

A Fast Approach to the Detection of All-Purpose Hubs in Complex Networks with Chemical Applications

A novel algorithm for the fast detection of hubs in chemical networks is presented. The algorithm identifies a set of nodes in the network as most significant, aimed to be the most effective points of distribution for fast, widespread coverage throughout the system. We show that our hubs have in general greater closeness centrality and betweenness centrality than vertices with maximal degree, w...

BIDEL: An XML-Based System for Effective Fast Change Detection of Genomic and Proteomic Data

A key issue to address in biological data integration is how to detect changes to the underlying biological data sources. In this demonstration, we present a novel system called BIDEL for detecting changes to genomic and proteomic data (sequences and annotations). We transform heterogeneous biological data to XML format (if necessary) and then detect changes between two versions of unordered XM...

Optimized Seizure Detection Algorithm: A Fast Approach for Onset of Epileptic in EEG Signals Using GT Discriminant Analysis and K-NN Classifier

Background: Epilepsy is a severe disorder of the central nervous system that predisposes the person to recurrent seizures. Fifty million people worldwide suffer from epilepsy; after Alzheimer’s and stroke, it is the third most widespread nervous disorder. Objective: In this paper, an algorithm to detect the onset of epileptic seizures based on the analysis of brain electrical signals (EEG) has b...

Improving the RX Anomaly Detection Algorithm for Hyperspectral Images using FFT

Anomaly Detection (AD) has recently become an important application of target detection in hyperspectral images. The Reed-Xiaoli (RX) detector is the most widely used AD algorithm, but it suffers from the “small sample size” problem. The best solution for this problem is to use Dimensionality Reduction (DR) techniques as a pre-processing step for the RX detector. Using this method not only improves the detection p...

A Fall Detection System based on the Type II Fuzzy Logic and Multi-Objective PSO Algorithm

The health of the elderly is an important issue, since these people are priceless sources of experience in society. Elderly adults are more likely to be severely injured or to die following falls. Hence, fast detection of such incidents may even save the life of the injured person. Several techniques have been proposed lately for the fall detection of people, mostly cate...

Publication date: 2007